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Abstract 

Background: Retention of a subset of introns in spliced polyadenylated mRNA is emerging as a frequent, 
unexplained finding from RNA deep sequencing in mammalian cells. 

Results: Here we analyze intron retention in T lymphocytes by deep sequencing polyadenylated RNA. We show a 
developmentally regulated RNA-binding protein, hnRNPLL, induces retention of specific introns by sequencing RNA 
from T cells with an inactivating Hnrpll mutation and from B lymphocytes that physiologically downregulate Hnrpll 
during their differentiation. In Ptprc mRNA encoding the tyrosine phosphatase CD45, hnRNPLL induces selective 
retention of introns flanking exons 4 to 6; these correspond to the cassette exons containing hnRNPLL binding sites 
that are skipped in cells with normal, but not mutant or low, hnRNPLL. We identify similar patterns of hnRNPLL- 
induced differential intron retention flanking alternative exons in 14 other genes, representing novel elements of 
the hnRNPLL-induced splicing program in T cells. Retroviral expression of a normally spliced cDNA for one of these 
targets, Senp2, partially corrects the survival defect of Hnrp//-mutant T cells. We find that integrating a number of 
computational methods to detect genes with differentially retained introns provides a strategy to enrich for 
alternatively spliced exons in mammalian RNA-seq data, when complemented by RNA-seq analysis of purified cells 
with experimentally perturbed RNA-binding proteins. 

Conclusions: Our findings demonstrate that intron retention in mRNA is induced by specific RNA-binding proteins 
and suggest a biological significance for this process in marking exons that are poised for alternative splicing. 



Background 

Splicing of introns from pre-messenger RNA is a tightly 
regulated process performed by the spliceosome [1]. The 
assembly of the spliceosome at appropriate exon-intron 
boundaries is influenced by regulatory RNA-binding pro- 
teins (RBPs) that operate antagonistically to both enhance 
and repress this splicing machinery [2-4]. Heterogeneous 
nuclear ribonucleoproteins (hnRNPs) predominantly 
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perform a splicing repressor function and their binding 
blocks spliceosome assembly and leads to exclusion of 
exons from mature mRNAs. Serine/arginine-rich (SR) pro- 
teins bind enhancer sequences in RNA and promote asso- 
ciation of spliceosomal proteins at splice sites. This general 
model nevertheless requires testing in physiological experi- 
mental systems where developmentally regulated splicing 
factors are specifically perturbed, but few such systems 
have been described, especially for mammalian cells. 

Recent information has identified that transcription 
and splicing are functionally coupled, allowing cross-talk 
between the two processes [5]. Co-transcriptional assem- 
bly of the spliceosome on nascent, chromatin-associated 
RNA transcripts is a significant factor in correct gene 
regulation in yeast [6], Drosophila [7] and mammals 
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[8-10]. The spliceosome has been shown to quickly asso- 
ciate with chromatin-associated nascent RNA in human 
HeLa cells [11] and slowing the rate of RNA polymerase 
II transcription in Drosophila greatly increases the rate 
of alternative exon inclusion [12]. High-throughput se- 
quencing of chromatin-associated, nascent RNA in Dros- 
ophila has revealed that the majority of introns are 
co-transcriptionally spliced at least half of the time, 
though a minority of introns are spliced slowly and some 
appear never to be co-transcriptionally spliced [7]. This 
variability in co-transcriptional splicing efficiency occurs 
even within single transcripts and suggests that splicing 
is regulated at the level of the intron [7], presumably by 
different RBPs such as the hnRNPs and SR proteins. In- 
trons that are consistently identified to be resistant to 
co-transcriptional splicing correlate with annotated al- 
ternative exons [7,9,13]. 

To understand mammalian alternative splicing, and de- 
fine the relationship between variable intron retention after 
transcription and alternative splicing, it would be useful to 
be able to experimentally perturb developmentally regu- 
lated alternative splicing events through genetic mutations 
in the specific RBPs that control them. One of the best de- 
fined mammalian alternative splicing events occurs in the 
Ptprc gene encoding the major plasma membrane tyrosine 
phosphatase, CD45, in T lymphocytes and other blood leu- 
kocytes [4,14]. In memory T cells that have been activated 
previously by antigens, exons 4, 5 and 6 are skipped in the 
translated Ptprc mRNA. The resulting loss of the CD45- 
RA, RB and RC domains in the extracellular domain of 
the protein, detected by flow cytometric staining with 
specific antibodies, is used as the primary marker to differ- 
entiate memory T cells and activated T cells (CD45-RO^) 
from naive T cells (CD45-RA+ or CD45-RB+). Even in 
naive T cell mRNA all three Ptprc cassette exons are rarely 
included whereas they are all included in B lymphocyte 
Ptprc mRNA, resulting in the CD45R-ABC isoform 
(B220) that is detected by specific monoclonal antibodies 
to identify B cells. 

Silencing of Ptprc exons 4, 5 and 6 in T cells requires 
hnRNPLL, a protein with three RNA-recognition motif 
(RRM) domains whose mRNA expression correlates 
with Ptprc exon exclusion: it is highest in CD45RO^ ac- 
tivated and memory T cells that exclude exons 4 to 6, at 
intermediate levels in CD45RB^ naive T cells, and at very 
low levels in CD45RABC'^ B cells that include all three 
exons [15-17]. Mice homozygous for a destabilizing 
point mutation in the amino-terminal RRM domain, 
Hrnpll"^", fail to exclude exons 4, 5, and 6 in T-cell Ptprc 
mRNA and expression of CD45-RA and CD45-RC pro- 
tein isoforms are increased 50-fold on different T-cell 
subsets [16]. Likewise, increased inclusion of Ptprc exons 
4 to 6 occurs when hnRNPLL is depleted from human T 
cells by short hairpin RNA (shRNA) expression, while 



silencing of Ptprc exon 4 is induced in human T cells 
transfected to overexpress Hnrpll cDNA [15,17]. The 
isolated amino-terminal RRM domain normally binds 
with sequence specificity and micromolar affinity [16] to 
an RNA consensus sequence, the activation response se- 
quence (ARS), which mediates exon silencing in acti- 
vated T cells and occurs in each of Ptprc exons 4, 5 and 
6 [18]. Thus, hnRNPLL is a developmentally regulated 
Ptprc splicing silencer whose expression and activity are 
critical for the regulated changes in CD45 isoforms on T 
and B lymphocytes. 

A closely related protein, hnRNPL, has also been 
shown to bind Ptprc ARS RNA sequences present in 
exons 4 to 6 [19,20]. T cells from mice homozygous for 
a knockout of the Hnrpl gene have moderately increased 
inclusion of exons 4 and 6, resulting in a four-fold in- 
crease in CD45RA expression [21]; compared with a 50- 
fold increase caused by Hnrpll mutation. Thus, hnRNPL 
and hnRNPLL both contribute to exon silencing but 
their coordinated actions are only partly understood [4] . 

The interphase lifespan of Hrnplf^^ homozygous T cells 
is greatly shortened, resulting in decreased numbers of 
naive T cells in the circulation [16]. This effect of 
hnRNPLL deficiency occurs even in T cells with a null 
Ptprc gene [22], indicating that hnRNPLL controls other 
genes contributing to T cell persistence that have yet to be 
identified. Here we use this mammalian system to analyze 
the consequences of perturbing hnRNPLL either by muta- 
tion or natural expression differences, as revealed by glo- 
bal mRNA changes measured by RNA-seq. hnRNPLL was 
required to induce a distinct pattern of intron retention 
surrounding its known target cassette exons in Ptprc. Dif- 
ferential retention of introns in deeply sequenced RNA 
provided a signature that could be used to identify other 
mRNAs requiring hnRNPLL for correct splicing in T cells, 
and this represents a general, albeit not completely spe- 
cific, strategy to annotate the transcriptome for exons that 
are likely to undergo differential splicing. 

Results 

Alternative introns are differentially retained in Ptprc 
transcripts 

In a mouse genetic screen we previously identified 
hnRNPLL as an essential regulatory factor responsible for 
skipping of Ptprc exons 3, 4 and 5 in T cells. Thunder mice 
have a loss-of-function point mutation in the Hnrpll gene 
that destabilizes the amino-terminal ARS-binding RRM 
domain of hnRNPLL [16]. To analyze the consequences 
of Hnrplf'" upon T-cell mRNA splicing, we performed 
RNA-seq on mRNA isolated from CD8* T lymphocytes 
purified from transgenic OT-1 T-cell receptor (TCR) mice 
that were either wOd type or homozygous for Hrnplf^". 
The OT-1 TCR transgenes possessed by this mouse pro- 
vided a uniform T cell antigen receptor on the CD8^ 
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T cells with specificity for a known ovalbumin-derived 
peptide and ensured that most of the cells were antigeni- 
cally naive. This resulted in comparable frequencies of 
naive and memory T cells in the Hmplf^" animals and the 
wild-type controls, enabling the analysis to focus on the 
primary effects of the mutation on splicing and minimize 
the contribution of //wr/i/Z-independent differences in 
mRNA splicing between naive and memory T cells. 

RNA-seq was performed on biologically independent 
replicate samples starting with RNA purified from T cells 
and using oligo-dT to prime first strand cDNA synthesis. 
The resulting libraries were each sequenced to a depth of 
greater than 100 million single-end 125 bp reads on an 
lUumina GAIIx sequencer, and aligned to the mouse gen- 
ome (mm9) using TopHat with Bowtie [23]. To confirm 
that the majority of aligned sequence data came from 
spliced polyadenylated mRNA, we compared the ratio of 
intronic reads to exonic reads. For each intron in the 
9,162 genes expressed at more than a mean of 20 reads 
per exonic nucleotide, we calculated the number of reads 
in the last 25 bp of each intron and divided it by the num- 
ber of reads in the adjacent first 25 bp of the exon, to yield 
the ratio of reads across the 3' splice site (3'SS ratio [7,9]) 
(Figure 1). This showed that almost all mRNAs are com- 
pletely spliced with a mode of the intron retention less 
than 0.01 and a median of 0.019. 

The Ptprc gene contains 33 exons spanning 112 kb on 
chromosome 1 (Figure 2a), and is highly expressed in T 
cells with a median of >3,300 reads per exonic nucleo- 
tide in each of our CDS T-cell RNA-seq datasets. The 
majority of Ptprc introns were spliced out of 99% of the 
sequenced RNA, as illustrated by the large difference in 
read coverage over constitutive exons 9 to 29 compared 



to the intervening introns (Figure 2b) and by a median 
3'SS ratio of 0.01 for Ptprc introns as a whole. In wild- 
type CD8^ cells (CD45-RB'''sh_ j^^iow j^giow^ ^^^^^ 

was strong silencing of exons 4 and 6, which contain 
multiple ARS, whereas the more abundant exon 5, en- 
coding the CD45-RB isoform, was better represented in 
sequence reads. The hnRNPLL-dependence of this dif- 
ferential splicing is readily apparent from Figure 2c - the 
two wild-type T cell samples had a mean of 777 and 
1,639 reads per nucleotide in exons 4 and 6, respectively, 
compared with a mean of 3,735 for all Ptprc exons, 
whereas in the Hnrplf"^ T cells exon 4 and 6 were cov- 
ered by a mean of 2,747 and 3,766, respectively, com- 
pared to a mean of 3,766 for all Ptprc exons. 

Surprisingly, an abundance of reads was also observed 
in wild-type T cells selectively over Ptprc introns 3, 4, 5, 
6 and 7 compared with the other introns in the gene. 
The median read depths over introns 3, 4, 5, 6 and 7 
were 310, 366, 746, 759 and 552 reads, respectively, and 
were significantly higher (7^= 1.6 x 10'*, Mann- Whitney, 
W = 200) than the mean depths over all other introns 
(median read depth of 16). Mutant Hnrplf'^" cells also 
showed significantly increased reads in the same introns, 
though at a universally lower level than wild-type cells. 
Hence, wild-type CDS* T cells display a greater depth of 
intronic reads and inversely fewer exonic reads relative 
to mutant cells through the region of the Ptprc tran- 
script known to undergo alternative splicing. The me- 
dian read depth over these retained introns in wild-type 
cells is between S to 20% as abundant as sequence reads 
over all Ptprc exons. This indicates that, in wild-type 
CDS* T cells, approximately S to 20% of the sequenced 
Ptprc mRNA had not yet removed either exon 4 and 
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Figure 1 Most transcripts in wild-type and thunder CD8* T cells are fully spliced, yet a minority retain one or more introns. The genome- 
wide distribution of intron splicing frequency is skewed towards full removal (0% intron retention), but also includes fully unspliced transcripts (100% 
intron retention). The majority of introns (87%) have a splicing efficiency of greater than 90%, meaning they remain unspliced in fewer than 10% of 
polyadenylated mRNAs; 0.94% of introns have a splicing efficiency of less than 50% and were retained in more than 50% of polyadenylated mRNAs 
present in our RNA-seq data. 
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Figure 2 Increased intron retention in wild-type T cells correlates with exon-skipping events in Ptprc. (a) Exons 4, 5 and 5 of the Ptprc 
(CD45) gene are alternatively spliced in T cells and may be combined to produce eight distinct CD45 isoforms. The longest isoform, CD45RABC, is 
primarily expressed in B cells whereas T cells express different CD45 isoforms through their development and activation, (b) RNA-seq data along 
the length of the Ptprc gene show the introns flanking alternative exons 4, 5 and 6 are covered by a greater depth of sequence reads and 
indicates incomplete splicing of these introns in both wild-type (1/1/T; in red) and thunder (THU; in blue) CDS '" T cells, (c) Read depth over retained 
introns is consistent within biological replicate samples, but is different between wild-type (red) and thunder (blue) samples, as shown in the 
purple trace, which plots the wild-type minus thunder per nucleotide read depth through the alternatively spliced region of the Ptprc gene. 
Introns and exons are labeled as I- and E-, respectively, followed by their corresponding number. Thunder mutants with a hypomorphic Hnrpll 
gene show fewer reads aligned to introns 3, 4, 5, 6 and 7. 



exon 6 nor the introns that precede and follow these 
exons, yet 99% of the sequenced Ptprc mRNA had re- 
moved constitutively spliced introns. It is important to 
note that deep sequencing of RNA provides a unique 
opportunity to measure mammalian intron retention in 



a relatively unbiased manner, in contrast to PCR-based 
methods that are biased against detecting the very large 
intronic inserts. 

As an independent test of the effect of diminished 
hnRNPLL activity on lymphocyte mRNA splicing and 
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intron retention, we compared RNA-seq data from 
magnetic-bead enriched CD4* T cells (Hnrpll-high.) and 
CD19* B-cells (Hnrpll-low) (Figure 3a). As expected for 
bead-based enrichment methods, the B-cell-enriched 
samples still contained an estimated 7% T cells based on 
mean expression of a set of T-cell-specific mRNAs, 



assuming exclusive T-cell expression of these genes. 
Conversely, using the same measure, the T-cell-enriched 
samples retained an estimated 1% B cells. In CD19* B 
cells, the median sequence read depth over Ptprc exons 
4, 5 and 6 was 344, which is greater than the median 
read depth of 165 over all Ptprc exons, demonstrating 
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Figure 3 Ptprc intron retention correlates with hnRNPLL expression in wild-type T cells and low expression in B cells, (a) CD19* B cells 
(blue traces), which have naturally low expression ot Hrnpli, display decreased intron retention compared with CD4* T cells. The schematic 
diagram shows RNA-seq data prepared from three independent samples of wild-type lymphocytes enriched by magnetic beads for CD4* T cells 
or CD19"^ B cells. The difference in intron and exon read depths between T and B cells is shown in purple, (b) The pattern of decreased intron 
retention is recapitulated in an independent RNA-seq dataset from the ImmGen project [24] for wild-type T and B cells purif ed by fluorescence- 
activated cell sorting. T cells (red trace) show increased read depths over introns flanking the alternative exons of Ptprc, compared with B cells 
(blue trace). The difference in read depth over introns and exons is indicated by the purple trace. 
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Strong inclusion of these three exons as expected, since 
B cells express very little hnRNPLL. In CD4* T cells, 
which do express hnRNPLL, the read depth over exons 
4 and 6 was lower (median 130) than the median 
depth over Ptprc exons as a whole (192), confirming 
silencing of these exons. A reciprocal pattern was ob- 
served for the introns flanking exons 4 to 6, where 
there was a higher depth of reads in T cells (median of 
54) and a lower depth of reads in B cells (median of 
23). Hence, B cells with naturally low Hnrpll expres- 
sion splice out introns 4 to 6 more efficiently than T 
cells that express Hnrpll. 

As further evidence of differential intron retention be- 
tween T and B cells, we analyzed RNA-seq data gener- 
ated by the ImmGen project [24] from polyadenylated 
mRNA isolated from highly purified CD4^ T cells and 
CD19* B cells prepared using fluorescence-activated cell 
sorting (Figure 3b) [25]. Ptprc introns 4 to 6 exhibited 
numerous reads in T cells, with a median of 144 com- 
pared to 1,655 for all Ptprc exons. By contrast, very few 
intron 4 to 6 reads were obtained from B-cell mRNA, 
with a read depth of 5 compared to a median of 1,071 
for all Ptprc exons in the same sample. This indicates 
that introns 4 through 6 are efficiently spliced out of 
99.5% of Ptprc mRNA in B cells that have little 
hnRNPLL, but are retained in 9% of the corresponding 
mRNA from CD4 T cells. 

Retained introns are frequent and correlate with 
alternatively spliced exons 

Given the findings above, we asked how many other 
mRNA species show a retained intron in normal mouse 
T cells. The 3'SS ratio data shown in Figure 1 suggest 
that up to 13% of all introns in polyadenylated mRNA 
remain unspliced in more than 10% of the correspond- 
ing mRNAs (3'SS >0.1). A smaller subset of introns 
(0.94% of all introns) remain unspliced in more than 
50% of the mRNAs that contain them. We therefore 
asked if exons observed with a retained flanking intron 
commonly undergo alternative splicing, by generating a 
list of exons that had a 3'SS ratio >0.1 in wild-type CDS 
T cells from OTl mice. In total, 6,797 exons had a 3'SS 
ratio greater than 0.1, derived from 3,248 genes (Figure 4, 
Table 1). The median intron retention (IR) ratio in de- 
tectable genes was 0.019 in wild-type T cells. When 
compared with annotated alternative exons in the 
Ensembl mouse gene set [26] (NCBIM37, Ensembl re- 
lease 67), this list showed a highly significant, non- 
random overlap with the alternative splicing categories 
of cassette exons, mutually exclusive exons and intron 
retention (according to the nomenclature of [27]) 
whereas constitutive exons were found significantly less 
than expected (permutation P-value = 0.001; Figure 4, 
Table 1). The alternatively spliced Ptprc exons 4 and 6 



that are skipped in Hnrpll wild-type T cells had a 3'SS 
ratio of 0.7 and 0.5, respectively, placing them within the 
top 8% of all exons ranked by largest 3'SS ratio. Hence, 
3'SS ratios as a simple measure of intron retention can 
be used to enrich for mammalian exons that undergo 
physiological alternative splicing. 

Detection of differentially retained introns in Hnrplf^" 
mutants 

The data presented above imply that binding of 
hnRNPLL to exons in pre-mRNAs delays the excision of 
flanking introns from polyadenylated mRNA. We there- 
fore sought other hnRNPLL-regulated genes using algo- 
rithms to detect differential intron retention or exon 
exclusion between wild-type and Hnrplf^" CD8^ T cells. 
Using the DEXSeq method [28] we compared the rela- 
tive intron read counts within each annotated mouse 
gene between wild-type and Hnrplf^" CD8* T cells from 
OTl mice. This identified 138 introns from 114 genes 
with significantly different numbers of reads between 
wild type and mutant, including intron 6 from the Ptprc 
gene, which ranked second most differentially retained 
according to P-value {Ptprc introns 3, 4, 5 and 7 were 
also within the top six most significantly retained in- 
trons; Additional file 1). Intron 1 of the Ctse gene was 
more highly retained than all Ptprc introns, and introns 
from the Ddb2, Senp2, TrimSOa and Atp2cl genes com- 
prised the remainder of the top 10 highest ranked in- 
trons by DEXSeq. Visual inspection of reads for these 
and other highly ranked genes using the Integrative 
Genomics Viewer (IGV) [29] showed clear cases of 
differential intron retention between wild-type and 
Hnrplf''" T cells, although these were generally less 
striking than that observed in Ptprc, most likely due to 
these mRNAs being less abundant. 

To identify differential exon exclusion we used DEX- 
Seq to compare the relative read depths for each exon 
within annotated mouse genes between wild-type and 
Hnrplf''" T cells, and ranked exons with significant dif- 
ferences according to P- value (Additional file 2). Ptprc 
exons 4 and 6 were ranked first and second, respectively. 
Of the 138 introns with DEXSeq evidence of differential 
intron retention above, 18 of these had at least one 
flanking exon with evidence of differential exon exclu- 
sion. These intron-exon pairs were from the Ptprc, 
Degsl, Ms4a6c, Cdc42, Il7r, Ctse, GimapS, Slfnl, Rpl29, 
Mrpl3 and Ighg2c genes. 

As a third method to identify hnRNPLL-regulated spli- 
cing events, we surveyed differential splice junction 
usage. We used DEXSeq to compare the reads aligned 
to all splice junctions between wild-type and Hnrplf^" 
T cells, which found 588 significant junctions out of 
230,416 in total (at a false discovery rate of 0.1), three- 
quarters of which show increased usage in T cells from 
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Figure 4 Intron retention ratio is correlated withi cassette exon, mutually exclusive exon and intron retention splicing events. 

Frequency of matched alternative splicing event types in 1,000 random re-sampling exons representing the background distribution is compared 
with the observed number of each splicing event (highlighted in red dotted line) for the exons with increased intron retention (3'SS IR >0.01; 
CNE, constitutive exon; CE, cassette exon; MXE, mutually exclusive exon; IR, intron retention; A3SS, alternative 3' splice site; A5SS, alternative 
5' splice site; Table 1). CE, MXE and IR types are significantly enriched, whereas CNE is significantly reduced compared with background (at false 
discovery rate P-value = 0.001). A3SS and A5SS are not found to be enriched. 



thunder mice, indicating greater splicing diversity is as- 
sociated with mutant hnRNPLL (Additional file 3). 
Among 57 junctions that were equally highest-ranked 
with an adjusted P-value of 0, splice junctions that 
connect alternative exons in the Ptprc gene were found 
with a reciprocal pattern of exon-joining versus exon- 
skipping. Junctions joining exon 8 and exon 3, exon 7 
and exon 3, exon 7 and exon 5, and exon 5 and exon 3 
had higher relative sequence coverage in wild-type T 



cells, whereas junctions joining exon 6 and exon 5, exon 
5 and exon 4, and exon 4 and exon 3 had higher cover- 
age in Hnrplf^" T cells. Of the top ten genes ranked by 
evidence of intron retention above, five of these also 
contained significant differentially altered splice junction 
usage flanking these introns {Ptprc, Ctse, TrimSOa, Rab3- 
gap2 and Slcl2a7). 

Figure SI in Additional file 4 shows a Venn diagram of 
the overlap of gene identifiers with significant differential 



Table 1 Intron retention ratio is correlated with cassette exon, mutually exclusive exon and Intron retention 
splicing events 



Alternative splicing 
event type 


Number of annotated 
events in ENSEMBL for all 
exons (n = 38,911) 


Number of matched events 
for exons with 3'SS IR >0.10 
(n = 6,658) 


Mean number of matched events 
in background (1,000 x re-sampling 
6,658 exons from 38,91 1) 


Permutation 
P-value 


CNE 


19,513 


2,538 


3,338 


0.001 


CE 


8,641 


1,936 


1,479 


0.001 


MXE 


1,265 


294 


216 


0.001 


IR 


3,614 


749 


620 


0.001 


A3SS 


1,249 


208 


213 


0.636 


A5SS 


1,242 


243 


213 


0.009 



Number of alternative splicing event types mapped within 1) all exons in expressed genes and 2} exons having at least 10% of intron retention in their 3'SS, and 
the mean number of the event types within randomly sampled exon sets of the same size. CNE, constitutive exon; CE, cassette exon; MXE, mutually exclusive 
exon; IR, intron retention; A3SS, alternative 3' splice site; A5SS, alternative 5' splice site. 
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intron retention, exon inclusion or alternative splice junc- 
tion ratios. While a total of 9 genes {Ptprc, Ctse, Ighg2c, 
Wdr82, Ms4a6c, Gnai2, Pik3r4, Degsl, Rrp9) have signifi- 
cant differential intron retention, exon inclusion and al- 
tered splice junction usage (and another 51 with just two 
of these features), they are a minority compared with the 
total count of genes with one of these features. Hence, ei- 
ther methodology issues and noise in the RNA-seq data 
may hamper the identification of alternative splicing 
events associated with intron retention, or intron reten- 
tion may not always be associated with alternative splicing. 
It seems plausible that intron retention induced by 
hnRNPLL binding may delay splicing of certain introns, 
but may not ultimately alter the splicing fate of the pre- 
mRNA molecule. 

hnRNPLL-dependent splicing in mouse T cells 

The preceding analysis was validated in the sense that it 
identified known hnRNPLL-dependent splicing events in 
Ptprc, so we used reverse transcriptase polymerase chain 
reaction (RT-PCR) to validate highly ranked splicing 
events in other genes. Ten candidate alternative exons 
from distinct genes were selected, choosing those genes 
that ranked highly in one or more of the tests described 
in the previous section, and showed a clear differential 
pattern of read distribution between cell types when 
inspected visually using the IGV [29]. Oligonucleotides 
complementary to flanking exons were used to amplify 
PGR products from oligo-dT-primed cDNA prepared 
from Hnrpir"" and Hnrplf^" mRNA, and the products 
analyzed by gel electrophoresis (Figure 5). In addition 
to Ptprc as a positive control, six of the ten candidate 
genes, Senp2, Ctse, Trpv2, Ashll, Slcl2a7, Lck, displayed 
differential band sizes or multiple bands that differed 
between T cells with normal or mutant hnRNPLL 
(Figure 5). Interestingly, these genes all showed evidence 
for differential exon usage, but none showed bands at- 
tributable to retention of a long intron. This likely can 
be attributed to PGR bias towards shorter products and 
long intron-containing transcripts not being amplified, 
which is a bias not shared with RNA-seq. Alternative 
splicing events found in only three of these genes {Ptprc, 
Trpv2, Slcl2a7) are currently identified by mouse 
expressed sequence tags (ESTs) from GenBank [30], in- 
dicating the need for more comprehensive annotation of 
alternative splicing by RNA-seq studies. 

Browsing the RNA-seq reads data for these genes with 
IGV indicated that, like Ptprc, three genes {Senp2, Ctse, 
Slcl2a7) appeared to have greater intron retention in 
wild-type cells than in Hnrplf^" T cells, whilst the con- 
verse was true for two other genes (Trpv2, Ashll) and 
indeterminate for one (Lck) (Figure S2 in Additional 
file 4). High intron retention and exon skipping occurred 
in wild-type T cells for Senp2 and Ctse; for Ashll and 



Slcl2a7, however, high intron retention in Hnrpir"^ T 
cells was associated with exon inclusion. This indicates 
that either hnRNPLL binding can have both a splicing 
repressor and enhancer role in wild-type cells, or that 
potentially some of these genes are indirectly regulated 
by hnRNPLL or are regulated in concert with another 
RBP. For all validated candidate genes, the observed al- 
ternative splice junction usage from RNA-seq reads sup- 
ports the alternative products observed with PGR. 

For three candidate hnRNPLL targets {Senp2, Ctse and 
Ashll) the alternative junctions detected by RNA-seq 
were outside the significance threshold applied in the 
DEXSeq analysis (Figure S2 in Additional file 4). Notable 
from the RNA-seq data for these genes was the enrich- 
ment of cryptic junction reads. Generally, cryptic splice 
sites are rare in the RNA-seq data, but four genes 
{Senp2, Ctse, Trpv2, Slcl2a7) had a diverse number of 
cryptic splice junctions in their retained introns and 
flanking the identified alternative exons. Another feature 
of these differentially spliced genes was minor unanno- 
tated alternative exons. Senp2, Ashll and perhaps 
Slcl2a7 and Trpv2 have differentially included alterna- 
tive exons that lie within retained introns. Within Senp2, 
a differentially included, unannotated, alternative exon 
appears in Hnrplf''" cells, which is only present at trace 
levels in wild type. Further investigation of this cryptic 
exon in Senp2 by sequencing PGR bands showed that it 
was included in variable lengths with alternative 3' 
splice sites (Figure S3 in Additional file 4). Inclusion of 
this cryptic exon introduced an in-frame stop codon that 
we estimated caused 23% of the protein to be truncated 
in Hnrplf^" cells (calculated as the percentage of cryptic 
junctions compared with canonical junctions). 

By combining each of the features of hnRNPLL- 
dependent alternative splicing events observed above, 
we developed a simple scoring scheme based on the sum 
of Boolean values for each feature (Table 2). These fea- 
tures were intron retention, exon exclusion and splice- 
junction usage (determined by DEXSeq) along with the 
presence of cryptic junctions and high 3 - or 5'SS ratios. 
We scored all introns by these features and ranked them 
accordingly. Ptprc and four other PGR-validated genes 
{Senp2, Ctse, Ashll, Slcl2a7) ranked within the top 25, 
while Trpv2 and Lck ranked 63 and 164, respectively. A 
further 136 genes had introns with scores that ranked 
them among the validated genes and were likely candi- 
dates for hnRNPLL-dependent splicing. We selected 15 
of these candidate genes after visual inspection of the 
RNA-seq read data with IGV and conducted further RT- 
PGR validation of alternative splicing. The results identi- 
fied 8 of these 15 genes {Degsl, Sidtl, MapkapkS, Herc3, 
Ikbke, CepllO, Mllt6, Raplgdsl) with differential band 
sizes between hnrplt^ and hnrplf^^ T cells (Figure 6). 
Table 2 shows the scoring matrix of Boolean values of 
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Figure 5 (See legend on next page.) 




Cho et al. Genome Biology 2014, 15:R26 
http://genomebiology.com/201 4/1 5/1 /R26 



Page 1 0 of 1 7 



(See figure on previous page.) 

Figure 5 Altered mRNA splicing between wild-type and mutant Hnrpll CDS* T cells validated with RT-PCR. A selection of ten genes that 
ranl<ed liiglily in one or more tests for differential alternative splicing in Hnrpll"^" T cells were validated using RT-PCR. Six genes showed differential 
PCR bands between samples and are show alongside an ideogram of the inferred sequence included in each product, determined from band 
sizes and the expected included sequence indicated by RNA-seq junction read information, (a) Ptprc gene, oligonucleotide PCR primers located 
in exons 2 and 7, amplifying across the regions of alternative exons 4, 5 and 6. (b) Senp2 gene, primers located in exons 8 and 1 1, amplifying the 
intron between exons 10 and 1 1 containing a variably included unannotated or cryptic exon. The sequence of each band in the accompanying 
ideogram was confirmed by Sanger sequencing, (c) Ctse gene, primers located in exon 1 and 2, amplifying unannotated exon in intron 1. 
(d) Trpv2 gene, primers located in exons 1 and 3 to amplify unannotated exons in introns 1 and 2. (e) Ash! I gene, primers located in exons 20 
and 21, spanning variably included unannotated exon in intron 20. (f) 5lcl2a7 gene, primers located in an alternative first exon, such that 
preference to the other first exon there would be no product, (g) Lck gene, primers were designed with a forward primer spanning across exons 
1 and 4 and a reverse primer in exon 5, amplifying products of exon 1 and 4 joining (skipping exons 2 and 3) with variable length of exon 4. 
Oligonucleotide primer locations are displayed against read depth and gene intron/structure in Figure S2 in Additional file 4. WT, wild type. 



these high ranking genes confirmed by RT-PCR. Across 
these genes and those identified earUer, gene set enrich- 
ment analysis did not indicate significant shared Gene 
Ontology terms, common pathways or other functional 
annotations. 

Normally spliced Senp2 cDNA restores Hnrpll'''" T-cell 
survival 

The lifespan of Hrnplfi^^ homozygous T cells is greatly 
shortened, resulting in decreased numbers of naive T 
cells in the circulation [16] due to effects on hnRNPLL 
targets other than Ptprc [22]. We focused on Senp2 as 
an additional candidate, since it was highly ranked and 
validated with strong evidence of intron inclusion and 
cryptic splicing that would decrease the pool of normal 
Senp2 protein in the analysis of Hrnplf^^ homozygous 
OT-1 T cells above, and because it is an essential regula- 
tor of protein sumoylation with the capacity to affect 
many aspects of cell survival in non-lymphoid cells 
[31,32]. OT-1 CDS T cells with wild-type or homozygous 
mutant Hnrpll were activated briefly in culture, trans- 
duced with a retroviral vector encoding normally spliced 
Senp2 cDNA or an empty vector control, and re- 
implanted into normal C57BL/6 mice. Individual trans- 
ferred CDS T cells were enumerated in blood samples 
collected from the recipients on day 3 and again on day 
17, using flow cytometric staining for a donor-specific 
CD45.1 marker (Figure 7a). Wild-type T cells transduced 
with Senp2 or empty vector were frequent in the circula- 
tion on day 3, and persisted at 40% of this frequency on 
day 17. Hnrpll mutant T cells transduced with empty 
vector were less frequent on day 3, and had declined to 
11% of this frequency by day 17, consistent with previ- 
ous evidence that these T cells have a very short in vivo 
lifespan [16]. By contrast, mutant T cells transduced with 
Senp2 vector were present at higher frequency on day 3 
and persisted at 45% of this frequency by day 17. Im- 
proved survival of Hnrpll mutant T cells transduced with 
Senp2 compared with control vector, but not of Hnrpll 
wild-type T cells transduced with the same vector com- 
bination, was observed in two independent experiments 



(Figure 7b). Flow cytometric staining of the persisting 
Hnrpll mutant T cells showed that they nevertheless 
continued to be unable to silence Ptprc exon 6, display- 
ing 10 times higher CD45RC on their cell surface com- 
pared with the transferred wild-type T-cell controls 
(Figure 7c). This result, together with isolation of an 
ENU-induced exon 10-11 splicing mutation in Senp2 
that also results in shortened T-cell survival (Yan Mei, 
Anselm Enders, Edward M Bertram and Christopher C 
Goodnow, unpublished observations), provide evidence 
that Senp2 is an important functional target for Hnrpll- 
dependent T-cell survival. 

Discussion 

Our study develops a strategy for identifying regulated 
mRNA splicing events with RNA-seq of total polyA + 
RNA, using one of the best defined mammalian models of 
developmentally regulated alternative splicing in the Ptprc 
gene and a mutation that cripples hnRNPLL, a key RBP 
regulating this event. We have shown that introns flanking 
the excluded exons in this gene are retained in oligo-dT 
primed cDNA from T cells that express hnRNPLL, 
whereas these introns are efficiently spliced out of most of 
the mRNA in lymphocytes with defective or naturally low 
expression of hnRNPLL. Intron retention promoted by 
hnRNPLL appears to be a general feature of the transcrip- 
tome in T cells and we identified and validated a further 
14 genes showing a similar pattern of intron retention and 
alternative splicing. Functional evidence for the import- 
ance of one of the highest-ranked additional targets, 
Senp2, was obtained by retroviral cDNA expression restor- 
ing survival of Hnrpll-muXant T cells. The computational 
methodology we have applied to identify hnRNPLL- 
dependent splicing events in comparison between wild 
type and mutant cells is readily generalizable to identify 
the program of other regulatory RBPs that dictate patterns 
of alternative splicing, using small samples of total RNA 
that are readily obtained from purified ex vivo subsets of 
primary mammalian cell types. 

We have shown that partially spliced polyadenylated 
mRNAs, rather than being artifacts of RNA-seq data. 
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Table 2 All candidate hnRNPLL-regulated splicing events prioritized for RT-PCR confirmation 


Gene 


Intron location 




Feature scores 




Score 


Rank 






3'SS ratio 5'SS ratio 


Differential 
exon 


Differential Differential 
intron junction 


Cryptic 
junctions 






Ptprc 


1:140013603-140014330 


1 1 


1 


1 1 


1 


6 


1 


Degsl 


1:184207025-184208627 


1 1 


1 


1 1 


1 


6 


2 


Ptprc 


1:140016744-140019032 


1 1 


1 


1 1 




5 


3 


Ptprc 


1:140012220-140013530 


1 0 


1 


1 1 


1 


5 


4 


Ptprc 


1:140019162-140022879 


1 0 


1 


1 1 




4 


6 


Slcl2a7 


13:73870786-73901144 


1 0 


0 


1 1 


1 


4 


7 


Slcl2a7 


13:73901339-73921966 


0 1 


0 


1 1 


1 


4 


8 


Ptprc 


1:140014472-140016596 


1 1 


1 


1 1 




5 


11 


Rab3gap2 


1:187087739-187090779 


1 0 


0 


1 1 


1 


4 


13 


117! 


15:9440015-9440865 


0 0 


1 


1 0 


1 


3 


15 


Ctse 


1:133535135-133558452 


0 0 


1 


1 0 


1 


3 


16 


Sidtl 


16:44258155-44259545 


0 0 


0 


1 1 


1 


3 


17 


Il7r 


1 5:9438538-9439920 


0 0 


1 


1 0 


1 


3 


20 


Il7r 


15:9438114-9438461 


0 0 


1 


1 0 


1 


3 


21 


Ash 11 


3:88871164-88872908 


0 1 


0 


1 0 


1 


3 


23 


Senp2 


16:22032238-22036542 


1 1 


0 


1 0 


1 


4 


24 


Rnf167 


1 1 :70463837-70464265 


0 1 


0 


0 1 


1 


3 


26 


DUAbble 


14:28263236-28272964 


0 0 


0 


0 1 


1 


2 


33 


Sept7 


9:25104073-25105396 


0 0 


0 


0 1 


1 


2 


37 


MapkapkS 


9:107166096-107182976 


0 1 


0 


1 0 


, 


3 


44 


TrimSS 


7:109828211-109828650 


1 1 


0 


0 1 


1 


4 


48 


Sigirr 


7:148281217-148286377 


1 0 


0 


0 1 


1 


3 


50 


Trpv2 


11:62388103-62389048 


1 0 


0 


0 1 


1 


3 


62 


Herc3 


6:58824395-58826482 


1 0 


0 


1 0 


1 


3 


66 


Ighg2c 


12:114524254-114525607 


0 1 


1 


1 0 




3 


71 


Pik3r4 


9:105580651-105584514 


0 0 


0 


1 0 




2 


80 


Ikbke 


1:133162866-133166377 


0 0 


0 


1 0 




2 


83 


CepUO 


2:35013309-35015767 


0 0 


0 


0 1 




2 


84 


Fan 


3:87185307-87185682 


0 0 


0 


0 1 




2 


87 


M!lt6 


11:97538551-97539627 


0 0 


0 


1 0 




2 


104 


Raplgdsl 


3:138619284-138620373 


0 0 


0 


0 1 




2 


131 


Lck 


4:129234700-129234869 


0 0 


0 


0 1 




2 


163 



Number of alternative splicing event types mapped within 1) all exons in expressed genes and 2) exons having at least 10% of intron retention in their 3'SS, 
the mean number of the event types within randomly sampled exon sets of the same size. 



serve as markers of alternatively spliced exons and that 
the retention of introns in these transcripts is a regu- 
lated process. Previous studies of nascent mRNA have 
employed technically demanding nuclear fractionation 
before RNA isolation [7,8,33]. Our results are built on 
an emerging literature that demonstrates incompletely 
spliced, polyadenylated mRNAs - including unspliced in- 
trons 3 to 7 in Ptprc mRNA from macrophages - are fre- 
quently retained on chromatin as well as being present 
in the nucleoplasm [8,10,33,34]. Incompletely spliced. 



polyadenylated mRNAs are readily identified in RNA- 
seq datasets generated from whole-cell mRNA prepara- 
tions [10,35,36] and differential intron retention between 
different tissues and cell types has also been observed 
[7,13,36]. The cause of the differing intron excision in 
these studies is mostly unresolved and may stem from 
various factors, including intron length, strength of 5' 
splice sites, RNA polymerase II pausing, and the binding 
of SR proteins to introns [33]. Our study addresses this 
question by establishing a strong correlation between 
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Figure 6 RT-PCR validation of differential mRNA splicing of genes predicted by the Boolean scoring method. Genes with RNA-seq features 
associated with differential exon inclusion and intron retention were scored and ranked alongside a set of seven genes that we validated by RT-PCR to 
be differentially spliced. Fifteen genes that ranked by score among these validated cases were selected for further RT-PCR, of which eight also showed 
differential banding. These produas have been separated by eletrophoresis on acr/lamide gels. Alongside each gel picture for each gene is an ideogram 
of the inferred sequence of each product, estimated by band size and the expected products predicted by junction read abundance from RNA-seq data. 
A single replicate from each genotype only was appraised, and these correspond to wild-type (WT) replicate 2 and thunder replicate 2 in Figure 5. 
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CDS CD45RC 
Figure 7 Expression of normally spliced Senp2 cDNA partially restores survival of Hnrpll""' T cells. CDS T cells from Hnrpll*^^ or Hnrpll"^"'' 

OT-1 TCR-transgenic B6.SJL-CD45.1/CD45.2 mice were activated and spinoculated with defective retroviral particles containing either empty 
retroviral vector or the same vector bearing full length, normally spliced Senp2 cDNA, and injected into the circulation of normal C57BL/5 mice, 
(a) Blood from each recipient mouse was analyzed by flow cytometry 3 and 1 7 days later, enumerating the frequency of donor CD45.1 + T cells 
among CDS cells at each timepoint and the percentage that survived between days 3 and 17. (b) Data from two independent experiments 
analyzed as in (a), (c) Staining of the donor-derived T cells for CD45RC: black and blue histograms, +/+ cells with empty or 5enp2 vectors, 
respectively; red histogram, thu/thu cells with 5enp2 vector. 



intron retention in total polyA + RNA and the presence of 
variably spliced exons, and by experimentally demonstrat- 
ing the action of a developmentally regulated, RNA- 
binding splicing silencer, hnRNPLL. Nuclear fractionation 
and sequencing of nascent, chromatin-associated tran- 
scripts and those from the nucleoplasm will provide a 
higher degree of sensitivity for detecting unspliced tran- 
scripts, but the high quantity of input material required is 
unfeasible for cells of limited abundance. The diversity of 
RBPs in mammalian cells allows strong cell-type-specific 
regulation of splicing, as we have found, and produces an 
imperative to compare purified cell populations, which will 
frequently mean that starting cells will be in limited supply. 

Along with this simple means of generating data we 
have developed a scoring and ranking scheme to identify 
genes displaying properties associated with regulated in- 
tron retention and alternative splicing. No single feature 
of exon inclusion, intron retention or splice junction 
usage was solely sufficient to reliably identify a group of 
transcripts as undergoing regulated late splicing, but a 



simple integration of evidence from RNA-seq data could 
better rank genes. From a set of six genes with RNA-seq 
and RT-PCR evidence for differential splicing between 
wild-type and thunder Hnrpll mRNAs, we identified that 
cryptic junction reads are a frequent feature of introns 
flanking alternative exons. Machine learning approaches 
are often used to take information about a training set of 
genes to extrapolate further similar cases across a large 
dataset. We applied a naive Bayesian classifier to use these 
six genes to identify further targets of hnRNPLL regula- 
tion. Perhaps due to the small number of training genes 
we found the classifier did not produce a list of genes with 
convincing evidence on visual inspection of the RNA-seq 
data in IGV. As more studies elucidate the features of al- 
ternatively spliced genes regulated by RBPs, a greater 
number of known, training genes may enable better use of 
machine learning algorithms. 

Regulated retention of introns flanking alternative 
exons in the Ptprc gene indicates that splicing decisions 
are frequently delayed until after transcription and 
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polyadenylation. One suggested role for post-transcriptional 
splicing is that it provides more (or later) opportunities 
for regulated alternative splicing [10]. The retained in- 
trons we have observed in the Ptprc gene do not appear 
to be 'dead-end' transcripts destined for degradation, 
but are likely exons on which a splicing decision is pend- 
ing. Speculatively, delayed processing of variable exons in 
Ptprc may provide a pool of mRNA that can be rapidly 
modified in response to changing extracellular signals. 
This paradigm is perhaps best demonstrated by the un- 
conventional cytoplasmic splicing of Hacl/Xbpl mRNA 
in response to endoplasmic reticulum stress [37], but has 
also been suggested for conventional alternative splicing 
[13]. While there is no direct evidence to support this pos- 
sibility for Ptprc, TCR-induced dephosphorylation of PSFl 
promotes Ptprc exon silencing [38]. Potentially, this pool 
of 'delayed transcripts' are awaiting a cellular decision on 
the splicing of alternative exons. In the absence of this de- 
cision introns surrounding the alternative exons remain 
unspliced, to allow the option of a splicing event that 
causes an exon to be skipped. 

The differentially spliced genes between wild-type and 
thunder T cells elucidate the primary or secondary tar- 
gets of hnRNPLL and identify a cellular program that is 
likely to be specifically regulated between naive and acti- 
vated/memory T cells. Both SR proteins and hnRNPs are 
encoded by large gene families and the diversity of these 
proteins allows highly tissue-specific splicing regulation 
by single RBPs [39]. Hence, we need not expect a large 
number of hnRNPLL-dependent targets. However, apart 
from these genes all being co-expressed in T cells, there 
are few annotations that obviously link them functionally 
as a larger program. Several pairwise functional associa- 
tions are clear, such as the well characterized direct 
interaction of CD45 with LCK to dephosphorylate pTyr- 
505 and pTyr-394 regulatory tyrosines on LCK [40,41]. 
One hypothesis to explain this apparent lack of a 
broader program may be that the true functional targets 
of hnRNPLL are indeed a very small set, perhaps being 
just one, two or three genes. Functional associations be- 
tween these genes will not be effectively captured 
through gene set enrichment analysis, as gene pairs or 
trios will be statistically very sensitive to missing or in- 
complete functional annotations. The differential exon 
usage in the Ptprc gene is the clearest signal seen in the 
RNA-seq data and the signal of the other genes is much 
less distinct, and decreases quickly to the limit of detec- 
tion through our positive gene list. It could be that genes 
with weaker signals represent minor or promiscuous 
binding targets of hnRNPLL without strong functional 
effect. Of the stronger targets, namely those in our first 
list of six high ranking gene targets, it will be important 
to investigate the functional impact of the splicing regu- 
lation mediated by hnRNPLL. We provide evidence here 



that inefficient splicing of one of these targets, Senp2, 
contributes to the short lifespan of hnRNPLL-mutant T 
cells. Senp2 encodes a sumo-specific protease and results 
in embryonic lethality when knocked out [31,32], but 
how it promotes T-cell longevity will await further study. 

Detection of genes with differentially retained introns 
from RNA-seq data suggests a method for mapping the 
targets and extended regulatory program of hnRNPs and 
potentially some SR proteins. This approach identifies 
both direct and secondary targets of RNA binding pro- 
teins, and hence would be highly complementary to add- 
itional sources of information from RIP-seq, PAR-CLIP 
and other immunoprecipitation followed by sequencing 
approaches. Collections of mice with knockout and mu- 
tant hnRNPs are steadily growing and transcriptome 
data taken from appropriate cells could be extracted, se- 
quenced and mined using a technically simple approach 
similar to that presented here. For example, hnRNPLL 
and hnRNPL have overlapping roles in regulating the 
splicing of Ptprc in T cells [42]. With existing mouse 
knockouts for hnRNPL, mapping of the functional tar- 
gets of RBPs could be undertaken similarly to this study, 
in combination with genome-wide RNA-binding assays. 

Conclusions 

Collectively, our data indicate that lymphocytes that ex- 
press wild-type hnRNPLL at high levels (wild-type T cells) 
exhibit a delay in splicing of the Ptprc introns surrounding 
those exons (exons 4 and 6) with multiple hnRNPLL bind- 
ing sequences, so that 8 to 20% of mRNA retains these in- 
trons after they have spliced out most other introns and 
become polyadenylated. This provides in vivo genetic 
evidence in mammalian cells that intron processing is 
delayed around alternatively spliced exons in nascent 
mRNA, whereas it often occurs co-transcriptionally and 
preceding polyadenylation for introns associated with con- 
stitutively spliced exons, consistent with emerging obser- 
vations in other systems [7,8,33]. 

Materials and methods 

RNA preparation 

Single cell suspensions of mouse spleens were prepared 
by sieving and gentle pipetting followed by lysis of red 
blood cells. CD8^ T cells were isolated to >90% purity 
(as assessed by flow cytometry) using mouse CD8a 
microbeads (Miltenyi Biotec Australia Pty. Ltd. NSW, 
Australia) following the manufacturer's instructions. 
Total RNA was then extracted using TRIzol (Invitrogen™ 
Life Technologies Australia Pty. Ltd. Victoria, Australia) 
and isopropanol precipitation. The purity and integrity 
of RNA was determined using the Agilent Bioanalyzer, 
which reported an RNA integrity number of >8.5 for 
all samples. CD19* B cells and CD4* T cells were puri- 
fied from spleens of wild-type C57BL6 mice using 
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microbeads conjugated with either anti-CD19 or anti- 
CD4 antibodies. 

Sequencing and mapping 

Sequencing was done using an Illumina GAIIx with a 
single-end length of 125 bp for OTl samples and 76 bp 
for non-OTl samples. RNA-sequencing reads were 
aligned to the mouse reference genome mm9 (NCBIM37) 
using TopHat (vl.4.1) with Bowtie by using default pa- 
rameters, which allows up to two mismatches [23]. Mean 
base coverage of an intron or exon was calculated as fol- 
lowing: Number of reads mapped x Read length/Feature 
length. Gene expression was estimated from all reads 
mapping to all exons annotated to a gene. A minimum 
average coverage threshold of 20 bases was used to desig- 
nate genes as expressed. The raw sequencing data have 
been uploaded to NCBI Short Read Archive under acces- 
sion number SRP034881. 

Detection of differential splicing 

Identification of differential inclusion of introns, exons 
and/or splice junctions was done using the DEXSeq pack- 
age (version 1.8.0 on R version 3.0.2) [28], which normal- 
izes count data for different coverage depths and 
estimates data dispersions, then tests for differential in- 
tron/exon/j unction usage between wild type and mutant 
using a generalized linear model. An input dataset of non- 
overlapping exonic parts of the mm9 genome was created 
by the script dexseq_prepare_annotation.py provided in 
the package based on the Ensembl transcript annotation 
file Mus_musculus.NCBIM37.66.gtf, which was used to 
create a table of intronic parts for multi-exon genes. For 
more sensitive detection of differential reads within long 
introns, introns were further divided into 300 bp windows. 
The dexseq_count.py script, also part of the DEXSeq 
package, was then used to count the number of reads that 
fall within each exon or intron. Analysis of differential 
exon and intron inclusion were performed on both fea- 
tures simultaneously with DEXSeq (Additional file 5). As 
most introns contain few or no reads and pooling the data 
in this manner allowed better normalization. Dispersion 
estimation and statistical model calculation was per- 
formed using the TRT method provided with DEXSeq on 
exons and introns with a minimum of 10 counts over all 
samples. A raw splicing junction count dataset was gener- 
ated using TopHat and was used in DEXSeq analysis 
to find differential splice junction usage for junctions of 
a minimum count of 2 over all samples. At a false discov- 
ery rate of 0.1, 558 out of 230,416 junctions were found to 
be significant. 

Calculation of intron retention ratio 

Intron retention ratios across 5 ' and 3 ' splice sites were 
calculated as the sum of reads mapped to 25 bp 



upstream and downstream of the splice site, and calcu- 
lated as the ratio of intronic reads to exonic reads, as 
previously described [7]. 

Retained introns and annotated alternative splicing 
events 

Exons associated with high 3'- or 5'SS read ratios were 
compared to alternative splicing events annotated to the 
Ensembl mouse gene set [26] (NCBIM37, Ensembl release 
67). Annotated alternative splicing event types included 
constitutive exon (CNE), cassette exon (CE), mutually ex- 
clusive exons (MXE), intron retention (IR), alternative 3' 
sites (A3SS), and alternative 5' sites (A5SS) after the no- 
menclature described in [43] . A frequency of each splicing 
event type was calculated for exons that had a 3'SS ratio 
>0.01 in wild-type T cells from OTl mice, then compared 
with 1,000 randomly permuted sets for each type of spli- 
cing event. P-values were calculated for a null hypothesis 
that the frequencies of splicing event types in exons hav- 
ing at least 10% of intron retention are not different to a 
background distribution, with an alternative hypothesis 
that the frequencies in exons of increased 3'SS IR is 
higher for event types CE, MXE, IR, A3SS, and A5SS and 
lower for the CNE type. 

Scoring scheme to rank differential intron retention 
events 

Six measures related to differential intron retention were 
calculated: differential exon inclusion, differential splice 
junction usage, differential intron retention, distinct 
cryptic splice sites (per intron) and 3 - and 5'SS ratios. 
Differential exon, intron and splice junction read tests 
were calculated with DEXSeq as described above and 
features were scored 'true' if they were below a signifi- 
cance threshold equivalent to a false discovery rate of 
less than 0.1. Cryptic junctions were scored as 'true' if 
they had more than one unannotated junction supported 
by three or more reads in at least one sample. 3'- and 5' 
SS ratios were calculated as described above and these 
ratios compared between cells types. Where ratios dif- 
fered by 0.1 or more, these were scored as 'true'. Across 
all six feature types true values were summed and in- 
trons ranked according to total score. 

PCR validation of candidate genes 

RNA samples from OT-1 CD8* T lymphocytes from 
spleen of wild-type and thunder mice were prepared as 
described above then reverse transcribed to produce 
first-strand cDNAs (Marligen Biosciences, Ijamsville, 
MD, USA). cDNA products were quantified with a 
Nanodrop Spectrophotometer and used for PCR amplifi- 
cation of candidate genes using Taq DNA polymerase 
with a 55°C annealing temperature and 35 cycles, which 
were then electrophoresed in 2% agarose gel or 4% 
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polyacrylamide gel, stained with GelRed or Syber Gold, 
respectively. Primer sequences used for validation of 
alternative splicing variants are shown in Table S4 in 
Additional file 4. 

OT-1 T cell transduction and transplantation 

CD8 T cells were purified by magnetic bead depletion of 
other lymphocyte subsets from spleens of OT-1 TCR 
transgenic B6.SJL-CD45.1 congenic mice of Hnrpll*^^ or 
Hnrplf''"^'^" genotype. The T cells were activated by cul- 
ture at 2 X 10 cells/ml with plate-bound anti-CD3 and 
soluble anti-CD28 antibodies for 24 h, and then spinocu- 
lated with defective retroviral particles packaged in 
Phoenix cells containing either empty pMIG II vector or 
the same vector bearing full length normally spliced 
Senp2 cDNA. The T cells were placed back in culture 
with anti-CD3 and anti-CD28 for 24 h, washed, and 
injected into the lateral tail vein of normal C57BL/6 
mice. Each recipient mouse was bled 3 and 17 days later, 
blood cells stained for CDS, CD45.1 and CD45RC, and 
analyzed by flow cytometry. 

Additional files 
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